RAW SEQUENCE LISTING DATE: 10/19/2000 

PATENT APPLICATION: US/09/308,207 TIME: 08:34:10 

Input Set : A:\ES.txt 

Output Set: N:\CRF3\10192000\I308207.raw 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: MARIA DIAZ-TORRES ET AL.
(ii) TITLE OF INVENTION: METHOD FOR THE RECOMBINANT 

PRODUCTION OF 1,3 PROPANEDIOL 
(iii) NUMBER OF SEQUENCES: 68 
(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Genencor International, Inc. 

(B) STREET: 4 Cambridge Place 

1870 South Winton Road

(C) CITY: Rochester 

(D) STATE: NY 

(E) COUNTRY: U.S.A.
(F) ZIP: 14618

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: Windows 

(D) SOFTWARE: FastSEQ for Windows Version 2.0b 
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US/09/308,207

(B) FILING DATE: 13-May-1999 

(C) CLASSIFICATION: 
(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 60/030,601 

(B) FILING DATE: 13-NOV-1996 
(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Glaister, Debra 

(B) REGISTRATION NUMBER: 33,888 

(C) REFERENCE/DOCKET NUMBER: GC 369-2 
(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 650-864-7620 

(B) TELEFAX: 650-845-6504 

(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1668 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DHAB1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATGAAAAGAT CAAAACGATT TGCAGTACTG GCCCAGCGCC CCGTCAATCA GGACGGGCTG 60
72 ATTGGCGAGT GGCCTGAAGA GGGGCTGATC GCCATGGACA GCCCCTTTGA CCCGGTCTCT 120 

74 TCAGTAAAAG TGGACAACGG TCTGATCGTC GAACTGGACG GCAAACGCCG GGACCAGTTT 180 

76 GACATGATCG ACCGATTTAT CGCCGATTAC GCGATCAACG TTGAGCGCAC AGAGCAGGCA 240
78 ATGCGCCTGG AGGCGGTGGA AATAGCCCGT ATGCTGGTGG ATATTCACGT CAGCCGGGAG 300 
80 GAGATCATTG CCATCACTAC CGCCATCACG CCGGCCAAAG CGGTCGAGGT GATGGCGCAG 360 
82 ATGAACGTGG TGGAGATGAT GATGGCGCTG CAGAAGATGC GTGCCCGCCG GACCCCCTCC 420 
84 AACCAGTGCC ACGTCACCAA TCTCAAAGAT AATCCGGTGC AGATTGCCGC TGACGCCGCC 480

86 GAGGCCGGGA TCCGCGGCTT CTCAGAACAG GAGACCACGG TCGGTATCGC GCGCTACGCG 540
88 CCGTTTAACG CCCTGGCGCT GTTGGTCGGT TCGCAGTGCG GCCGCCCCGG CGTGTTGACG 600 
90 CAGTGCTCGG TGGAAGAGGC CACCGAGCTG GAGCTGGGCA TGCGTGGCTT AACCAGCTAC 660 
92 GCCGAGACGG TGTCGGTCTA CGGCACCGAA GCGGTATTTA CCGACGGCGA TGATACGCCG 720 
94 TGGTCAAAGG CGTTCCTCGC CTCGGCCTAC GCCTCCCGCG GGTTGAAAAT GCGCTACACC 780 
96 TCCGGCACCG GATCCGAAGC GCTGATGGGC TATTCGGAGA GCAAGTCGAT GCTCTACCTC 840
98 GAATCGCGCT GCATCTTCAT TACTAAAGGC GCCGGGGTTC AGGGACTGCA AAACGGCGCG 900 
100 GTGAGCTGTA TCGGCATGAC CGGCGCTGTG CCGTCGGGCA TTCGGGCGGT GCTGGCGGAA 960 
102 AACCTGATCG CCTCTATGCT CGACCTCGAA GTGGCGTCCG CCAACGACCA GACTTTCTCC 1020 
104 CACTCGGATA TTCGCCGCAC CGCGCGCACC CTGATGCAGA TGCTGCCGGG CACCGACTTT 1080 
106 ATTTTCTCCG GCTACAGCGC GGTGCCGAAC TACGACAACA TGTTCGCCGG CTCGAACTTC 1140 
108 GATGCGGAAG ATTTTGATGA TTACAACATC CTGCAGCGTG ACCTGATGGT TGACGGCGGC 1200 
110 CTGCGTCCGG TGACCGAGGC GGAAACCATT GCCATTCGCC AGAAAGCGGC GCGGGCGATC 1260 
112 CAGGCGGTTT TCCGCGAGCT GGGGCTGCCG CCAATCGCCG ACGAGGAGGT GGAGGCCGCC 1320 
114 ACCTACGCGC ACGGCAGCAA CGAGATGCCG CCGCGTAACG TGGTGGAGGA TCTGAGTGCG 1380
116 GTGGAAGAGA TGATGAAGCG CAACATCACC GGCCTCGATA TTGTCGGCGC GCTGAGCCGC 1440 
118 AGCGGCTTTG AGGATATCGC CAGCAATATT CTCAATATGC TGCGCCAGCG GGTCACCGGC 1500 
120 GATTACCTGC AGACCTCGGC CATTCTCGAT CGGCAGTTCG AGGTGGTGAG TGCGGTCAAC 1560 
122 GACATCAATG ACTATCAGGG GCCGGGCACC GGCTATCGCA TCTCTGCCGA ACGCTGGGCG 1620 
124 GAGATCAAAA ATATTCCGGG CGTGGTTCAG CCCGACACCA TTGAATAA 1668 
126 (2) INFORMATION FOR SEQ ID NO: 2: 

128 (i) SEQUENCE CHARACTERISTICS:

129 (A) LENGTH: 585 base pairs 

130 (B) TYPE: nucleic acid 

131 (C) STRANDEDNESS: single

132 (D) TOPOLOGY: linear 

134 (ii) MOLECULE TYPE: DNA (genomic) 

136 (vi) ORIGINAL SOURCE: 

137 (A) ORGANISM: DHAB2 

C--> 139 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

141 GTGCAACAGA CAACCCAAAT TCAGCCCTCT TTTACCCTGA AAACCCGCGA GGGCGGGGTA 60 

143 GCTTCTGCCG ATGAACGCGC CGATGAAGTG GTGATCGGCG TCGGCCCTGC CTTCGATAAA 120 

145 CACCAGCATC ACACTCTGAT CGATATGCCC CATGGCGCGA TCCTCAAAGA GCTGATTGCC 180 

147 GGGGTGGAAG AAGAGGGGCT TCACGCCCGG GTGGTGCGCA TTCTGCGCAC GTCCGACGTC 240

149 TCCTTTATGG CCTGGGATGC GGCCAACCTG AGCGGCTCGG GGATCGGCAT CGGTATCCAG 300

151 TCGAAGGGGA CCACGGTCAT CCATCAGCGC GATCTGCTGC CGCTCAGCAA CCTGGAGCTG 360

153 TTCTCCCAGG CGCCGCTGCT GACGCTGGAG ACCTACCGGC AGATTGGCAA AAACGCTGCG 420

155 CGCTATGCGC GCAAAGAGTC ACCTTCGCCG GTGCCGGTGG TGAACGATCA GATGGTGCGG 480

157 CCGAAATTTA TGGCCAAAGC CGCGCTATTT CATATCAAAG AGACCAAACA TGTGGTGCAG 540

159 GACGCCGAGC CCGTCACCCT GCACATCGAC TTAGTAAGGG AGTGA 585 
161 (2) INFORMATION FOR SEQ ID NO: 3:
163 (i) SEQUENCE CHARACTERISTICS:
164 (A) LENGTH: 426 base pairs
165 (B) TYPE: nucleic acid
166 (C) STRANDEDNESS: single
167 (D) TOPOLOGY: linear
169 (ii) MOLECULE TYPE: DNA (genomic)
171 (vi) ORIGINAL SOURCE:
172 (A) ORGANISM: DHAB3
C--> 174 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
176 ATGAGCGAGA AAACCATGCG CGTGCAGGAT TATCCGTTAG CCACCCGCTG CCCGGAGCAT 60
178 ATCCTGACGC CTACCGGCAA ACCATTGACC GATATTACCC TCGAGAAGGT GCTCTCTGGC 120
180 GAGGTGGGCC CGCAGGATGT GCGGATCTCC CGCCAGACCC TTGAGTACCA GGCGCAGATT 180
182 GCCGAGCAGA TGCAGCGCCA TGCGGTGGCG CGCAATTTCC GCCGCGCGGC GGAGCTTATC 240
184 GCCATTCCTG ACGAGCGCAT TCTGGCTATC TATAACGCGC TGCGCCCGTT CCGCTCCTCG 300
186 CAGGCGGAGC TGCTGGCGAT CGCCGACGAG CTGGAGCACA CCTGGCATGC GACAGTGAAT 360
188 GCCGCCTTTG TCCGGGAGTC GGCGGAAGTG TATCAGCAGC GGCATAAGCT GCGTAAAGGA 420
190 AGCTAA 426

192 (2) INFORMATION FOR SEQ ID NO: 4:
194 (i) SEQUENCE CHARACTERISTICS:
195 (A) LENGTH: 1164 base pairs
196 (B) TYPE: nucleic acid
197 (C) STRANDEDNESS: single
198 (D) TOPOLOGY: linear
200 (ii) MOLECULE TYPE: DNA (genomic)
202 (vi) ORIGINAL SOURCE:
203 (A) ORGANISM: DHAT
C--> 205 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
207 ATGAGCTATC GTATGTTTGA TTATCTGGTG CCAAACGTTA ACTTTTTTGG CCCCAACGCC 60
209 ATTTCCGTAG TCGGCGAACG CTGCCAGCTG CTGGGGGGGA AAAAAGCCCT GCTGGTCACC 120
211 GACAAAGGCC TGCGGGCAAT TAAAGATGGC GCGGTGGACA AAACCCTGCA TTATCTGCGG 180
213 GAGGCCGGGA TCGAGGTGGC GATCTTTGAC GGCGTCGAGC CGAACCCGAA AGACACCAAC 240
215 GTGCGCGACG GCCTCGCCGT GTTTCGCCGC GAACAGTGCG ACATCATCGT CACCGTGGGC 300
217 GGCGGCAGCC CGCACGATTG CGGCAAAGGC ATCGGCATCG CCGCCACCCA TGAGGGCGAT 360
219 CTGTACCAGT ATGCCGGAAT CGAGACCCTG ACCAACCCGC TGCCGCCTAT CGTCGCGGTC 420
221 AATACCACCG CCGGCACCGC CAGCGAGGTC ACCCGCCACT GCGTCCTGAC CAACACCGAA 480
223 ACCAAAGTGA AGTTTGTGAT CGTCAGCTGG CGCAAACTGC CGTCGGTCTC TATCAACGAT 540
225 CCACTGCTGA TGATCGGTAA ACCGGCCGCC CTGACCGCGG CGACCGGGAT GGATGCCCTG 600
227 ACCCACGCCG TAGAGGCCTA TATCTCCAAA GACGCTAACC CGGTGACGGA CGCCGCCGCC 660
229 ATGCAGGCGA TCCGCCTCAT CGCCCGCAAC CTGCGCCAGG CCGTGGCCCT CGGCAGCAAT 720
231 CTGCAGGCGC GGGAAAACAT GGCCTATGCT TCTCTGCTGG CCGGGATGGC TTTCAATAAC 780
233 GCCAACCTCG GCTACGTGCA CGCCATGGCG CACCAGCTGG GCGGCCTGTA CGACATGCCG 840
235 CACGGCGTGG CCAACGCTGT CCTGCTGCCG CATGTGGCGC GCTACAACCT GATCGCCAAC 900
237 CCGGAGAAAT TCGCCGATAT CGCTGAACTG ATGGGCGAAA ATATCACCGG ACTGTCCACT 960
239 CTCGACGCGG CGGAAAAAGC CATCGCCGCT ATCACGCGTC TGTCGATGGA TATCGGTATT 1020
241 CCGCAGCATC TGCGCGATCT GGGGGTAAAA GAGGCCGACT TCCCCTACAT GGCGGAGATG 1080
243 GCTCTAAAAG ACGGCAATGC GTTCTCGAAC CCGCGTAAAG GCAACGAGCA GGAGATTGCC 1140
245 GCGATTTTCC GCCAGGCATT CTGA 1164

247 (2) INFORMATION FOR SEQ ID NO: 5:
249 (i) SEQUENCE CHARACTERISTICS:
250 (A) LENGTH: 1380 base pairs
251 (B) TYPE: nucleic acid 

252 (C) STRANDEDNESS: single
253 (D) TOPOLOGY: linear

255 (ii) MOLECULE TYPE: DNA (genomic) 

257 (vi) ORIGINAL SOURCE:
258 (A) ORGANISM: GPD1

C--> 260 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



262 CTTTAATTTT CTTTTATCTT ACTCTCCTAC ATAAGACATC AAGAAACAAT TGTATATTGT 60

264 ACACCCCCCC CCTCCACAAA CACAAATATT GATAATATAA AGATGTCTGC TGCTGCTGAT 120

266 AGATTAAACT TAACTTCCGG CCACTTGAAT GCTGGTAGAA AGAGAAGTTC CTCTTCTGTT 180

268 TCTTTGAAGG CTGCCGAAAA GCCTTTCAAG GTTACTGTGA TTGGATCTGG TAACTGGGGT 240

270 ACTACTATTG CCAAGGTGGT TGCCGAAAAT TGTAAGGGAT ACCCAGAAGT TTTCGCTCCA 300

272 ATAGTACAAA TGTGGGTGTT CGAAGAAGAG ATCAATGGTG AAAAATTGAC TGAAATCATA 360

274 AATACTAGAC ATCAAAACGT GAAATACTTG CCTGGCATCA CTCTACCCGA CAATTTGGTT 420

276 GCTAATCCAG ACTTGATTGA TTCAGTCAAG GATGTCGACA TCATCGTTTT CAACATTCCA 480

278 CATCAATTTT TGCCCCGTAT CTGTAGCCAA TTGAAAGGTC ATGTTGATTC ACACGTCAGA 540

280 GCTATCTCCT GTCTAAAGGG TTTTGAAGTT GGTGCTAAAG GTGTCCAATT GCTATCCTCT 600

282 TACATCACTG AGGAACTAGG TATTCAATGT GGTGCTCTAT CTGGTGCTAA CATTGCCACC 660

284 GAAGTCGCTC AAGAACACTG GTCTGAAACA ACAGTTGCTT ACCACATTCC AAAGGATTTC 720 

286 AGAGGCGAGG GCAAGGACGT CGACCATAAG GTTCTAAAGG CCTTGTTCCA CAGACCTTAC 780 

288 TTCCACGTTA GTGTCATCGA AGATGTTGCT GGTATCTCCA TCTGTGGTGC TTTGAAGAAC 840 

290 GTTGTTGCCT TAGGTTGTGG TTTCGTCGAA GGTCTAGGCT GGGGTAACAA CGCTTCTGCT 900 

292 GCCATCCAAA GAGTCGGTTT GGGTGAGATC ATCAGATTCG GTCAAATGTT TTTCCCAGAA 960 

294 TCTAGAGAAG AAACATACTA CCAAGAGTCT GCTGGTGTTG CTGATTTGAT CACCACCTGC 1020 

296 GCTGGTGGTA GAAACGTCAA GGTTGCTAGG CTAATGGCTA CTTCTGGTAA GGACGCCTGG 1080 

298 GAATGTGAAA AGGAGTTGTT GAATGGCCAA TCCGCTCAAG GTTTAATTAC CTGCAAAGAA 1140 

300 GTTCACGAAT GGTTGGAAAC ATGTGGCTCT GTCGAAGACT TCCCATTATT TGAAGCCGTA 1200 

302 TACCAAATCG TTTACAACAA CTACCCAATG AAGAACCTGC CGGACATGAT TGAAGAATTA 1260

304 GATCTACATG AAGATTAGAT TTATTGGAGA AAGATAACAT ATCATACTTC CCCCACTTTT 1320 

306 TTCGAGGCTC TTCTATATCA TATTCATAAA TTAGCATTAT GTCATTTCTC ATAACTACTT 1380 
309 (2) INFORMATION FOR SEQ ID NO: 6: 



311 (i) SEQUENCE CHARACTERISTICS: 

312 (A) LENGTH: 2946 base pairs 

313 (B) TYPE: nucleic acid

314 (C) STRANDEDNESS: single

315 (D) TOPOLOGY: linear 

317 (ii) MOLECULE TYPE: DNA (genomic) 

319 (vi) ORIGINAL SOURCE: 

320 (A) ORGANISM: GPD2 

C--> 322 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



324 GAATTCGAGC CTGAAGTGCT GATTACCTTC AGGTAGACTT CATCTTGACC CATCAACCCC 60 

326 AGCGTCAATC CTGCAAATAC ACCACCCAGC AGCACTAGGA TGATAGAGAT AATATAGTAC 120 

328 GTGGTAACGC TTGCCTCATC ACCTACGCTA TGGCCGGAAT CGGCAACATC CCTAGAATTG 180 

330 AGTACGTGTG ATCCGGATAA CAACGGCAGT GAATATATCT TCGGTATCGT AAAGATGTGA 240

332 TATAAGATGA TGTATACCCA ATGAGGAGCG CCTGATCGTG ACCTAGACCT TAGTGGCAAA 300 

334 AACGACATAT CTATTATAGT GGGGAGAGTT TCGTGCAAAT AACAGACGCA GCAGCAAGTA 360

336 ACTGTGACGA TATCAACTCT TTTTTTATTA TGTAATAAGC AAACAAGCAC GAATGGGGAA 420

338 AGCCTATGTG CAATCACCAA GGTCGTCCCT TTTTTCCCAT TTGCTAATTT AGAATTTAAA 480

340 GAAACCAAAA GAATGAAGAA AGAAAACAAA TACTAGCCCT AACCCTGACT TCGTTTCTAT 540

342 GATAATACCC TGCTTTAATG AACGGTATGC CCTAGGGTAT ATCTCACTCT GTACGTTACA 600 

344 AACTCCGGTT ATTTTATCGG AACATCCGAG CACCCGCGCC TTCCTCAACC CAGGCACCGC 660 

346 CCCAGGTAAC CGTGCGCGAT GAGCTAATCC TGAGCCATCA CCCACCCCAC CCGTTGATGA 720 

348 CAGCAATTCG GGAGGGCGAA AATAAAACTG GAGCAAGGAA TTACCATCAC CGTCACCATC 780 

350 ACCATCATAT CGCCTTAGCC TCTAGCCATA GCCATCATGC AAGCGTGTAT CTTCTAAGAT 840

352 TCAGTCATCA TCATTACCGA GTTTGTTTTC CTTCACATGA TGAAGAAGGT TTGAGTATGC 900

354 TCGAAACAAT AAGACGACGA TGGCTCTGCC ATTGGTTATA TTACGCTTTT GCGGCGAGGT 960

356 GCCGATGGGT TGCTGAGGGG AAGAGTGTTT AGCTTACGGA CCTATTGCCA TTGTTATTCC 1020

358 GATTAATCTA TTGTTCAGCA GCTCTTCTCT ACCCTGTCAT TCTAGTATTT TTTTTTTTTT 1080

360 TTTTTGGTTT TACTTTTTTT TCTTCTTGCC TTTTTTTCTT GTTACTTTTT TTCTAGTTTT 1140

362 TTTTCCTTCC ACTAAGCTTT TTCCTTGATT TATCCTTGGG TTCTTCTTTC TACTCCTTTA 1200

364 GATTTTTTTT TTATATATTA ATTTTTAAGT TTATGTATTT TGGTAGATTC AATTCTCTTT 1260 

366 CCCTTTCCTT TTCCTTCGCT CCCCTTCCTT ATCAATGCTT GCTGTCAGAA GATTAACAAG 1320

368 ATACACATTC CTTAAGCGAA CGCATCCGGT GTTATATACT CGTCGTGCAT ATAAAATTTT 1380 

370 GCCTTCAAGA TCTACTTTCC TAAGAAGATC ATTATTACAA ACACAACTGC ACTCAAAGAT 1440

372 GACTGCTCAT ACTAATATCA AACAGCACAA ACACTGTCAT GAGGACCATC CTATCAGAAG 1500

374 ATCGGACTCT GCCGTGTCAA TTGTACATTT GAAACGTGCG CCCTTCAAGG TTACAGTGAT 1560 

376 TGGTTCTGGT AACTGGGGGA CCACCATCGC CAAAGTCATT GCGGAAAACA CAGAATTGCA 1620 

378 TTCCCATATC TTCGAGCCAG AGGTGAGAAT GTGGGTTTTT GATGAAAAGA TCGGCGACGA 1680 

380 AAATCTGACG GATATCATAA ATACAAGACA CCAGAACGTT AAATATCTAC CCAATATTGA 1740

382 CCTGCCCCAT AATCTAGTGG CCGATCCTGA TCTTTTACAC TCCATCAAGG GTGCTGACAT 1800

384 CCTTGTTTTC AACATCCCTC ATCAATTTTT ACCAAACATA GTCAAACAAT TGCAAGGCCA 1860

386 CGTGGCCCCT CATGTAAGGG CCATCTCGTG TCTAAAAGGG TTCGAGTTGG GCTCCAAGGG 1920

388 TGTGCAATTG CTATCCTCCT ATGTTACTGA TGAGTTAGGA ATCCAATGTG GCGCACTATC 1980

390 TGGTGCAAAC TTGGCACCGG AAGTGGCCAA GGAGCATTGG TCCGAAACCA CCGTGGCTTA 2040
392 CCAACTACCA AAGGATTATC AAGGTGATGG CAAGGATGTA GATCATAAGA TTTTGAAATT 2100 
394 GCTGTTCCAC AGACCTTACT TCCACGTCAA TGTCATCGAT GATGTTGCTG GTATATCCAT 2160 
396 TGCCGGTGCC TTGAAGAACG TCGTGGCACT TGCATGTGGT TTCGTAGAAG GTATGGGATG 2220 
398 GGGTAACAAT GCCTCCGCAG CCATTCAAAG GCTGGGTTTA GGTGAAATTA TCAAGTTCGG 2280 

400 TAGAATGTTT TTCCCAGAAT CCAAAGTCGA GACCTACTAT CAAGAATCCG CTGGTGTTGC 2340
402 AGATCTGATC ACCACCTGCT CAGGCGGTAG AAACGTCAAG GTTGCCACAT ACATGGCCAA 2400
404 GACCGGTAAG TCAGCCTTGG AAGCAGAAAA GGAATTGCTT AACGGTCAAT CCGCCCAAGG 2460
406 GATAATCACA TGCAGAGAAG TTCACGAGTG GCTACAAACA TGTGAGTTGA CCCAAGAATT 2520
408 CCCAATTATT CGAGGCAGTC TACCAGATAG TCTACAACAA CGTCCGCATG GAAGACCTAC 2580
410 CGGAGATGAT TGAAGAGCTA GACATCGATG ACGAATAGAC ACTCTCCCCC CCCCTCCCCC 2640 
412 TCTGATCTTT CCTGTTGCCT CTTTTTCCCC CAACCAATTT ATCATTATAC ACAAGTTCTA 2700 
414 CAACTACTAC TAGTAACATT ACTACAGTTA TTATAATTTT CTATTCTCTT TTTCTTTAAG 2760 
416 AATCTATCAT TAACGTTAAT TTCTATATAT ACATAACTAC CATTATACAC GCTATTATCG 2820 
418 TTTACATATC ACATCACCGT TAATGAAAGA TACGACACCC TGTACACTAA CACAATTAAA 2880 
420 TAATCGCCAT AACCTTTTCT GTTATCTATA GCCCTTAAAG CTGTTTCTTC GAGCTTTTCA 2940 
422 CTGCAG 2946 
424 (2) INFORMATION FOR SEQ ID NO: 7:

426 (i) SEQUENCE CHARACTERISTICS: 

427 (A) LENGTH: 3178 base pairs 

428 (B) TYPE: nucleic acid 

429 (C) STRANDEDNESS: single

430 (D) TOPOLOGY: linear 

432 (ii) MOLECULE TYPE: DNA (genomic)

434 (vi) ORIGINAL SOURCE:

VERIFICATION SUMMARY DATE: 10/19/2000

PATENT APPLICATION: US/09/308,207 TIME: 08:34:11

Input Set : A:\ES.txt

Output Set: N:\CRF3\10192000\I308207.raw

L:30 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER:]
L:31 M:220 C: Keyword misspelled or invalid format, [(B) FILING DATE:]
L:35 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER:]
L:36 M:220 C: Keyword misspelled or invalid format, [(B) FILING DATE:]
L:68 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:139 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:174 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:205 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:260 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:322 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:437 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:558 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:601 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:642 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:741 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:831 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:918 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:1050 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:1131 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:1242 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:1359 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:1422 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:1572 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:1990 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2006 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2020 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2034 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2048 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2062 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2076 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2090 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2104 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2118 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2132 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2146 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2160 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2177 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2243 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2363 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2417 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2459 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2546 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2560 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2574 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2588 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2602 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2616 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2630 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2644 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2659 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:2832 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=59, Value=[None]
L:2958 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=60, Value=[None]
L:2997 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=61, Value=[None]
L:3036 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=62, Value=[None]
L:3072 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=63, Value=[None]
L:3108 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=64, Value=[None]
L:3153 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=65, Value=[None]
L:3272 M:246 W: Invalid value of Alpha Sequence Header Field, [MOLECULE TYPE] SeqNo=67, Value=[None]
L:3467 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID: 68